I. Introduction

The goal of this work is to design and evaluate a new
architecture called DRALIC: Distributed RAID And
Location Independence Caching. DRALIC provides a
direct and immediate solution to boost web server
performance by making use of commodity computers
that are available today. DRALIC starts working
only when an actual disk request has come to the
device, no matter whether it is the result of a file system
miss or a request from a database operation. It
does not require any change to existing operating
systems, databases, or applications. In one
implementation, DRALIC combines the functions of
the disk I/O host bus adapter card (HBA) and the
functions of the network interface card (NIC) to form
an integrated I/O-Network card with a highly
intelligent embedded processor. In another
implementation, DRALIC bridges the HBA and NIC
by designing intelligent device drivers. Besides
network accesses, the new interface card or drivers at
each node control the local disk as well as a raw
RAM partition of the system RAM of the node. The
disk, together with the ones in other nodes in the
network, forms a distributed RAID that appears to
users as a large and reliable logical disk space. The raw
RAM partitions in all nodes together form a large,
global, and location independence cache for the
RAID that is accessible to any node connected to the
network, independent of its physical location.
Therefore, DRALIC works at the device or device driver
level to allow all the nodes to work together in
parallel to process web requests. The distributed
RAID allows parallel disk accesses and
provides fault tolerance using parity disks, whereas
the location independence caches provide cooperative
caching to the computing nodes for better I/O
performance. Furthermore, DRALIC is a cost-effective
architectural approach because it uses low-cost
PCs/Workstations that are often readily available
as existing computing facilities in an organization or
corporation.
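
The fault tolerance mentioned above rests on parity. The following minimal sketch shows the general XOR parity technique used by parity-based RAID levels; it is not taken from the DRALIC implementation, and the function name `xor_blocks` and the three-node stripe are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (generic RAID-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe: one data block from each of three nodes.
data = [b"node0---", b"node1---", b"node2---"]
parity = xor_blocks(data)

# If node 1 fails, its block is rebuilt from the surviving blocks plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

Because XOR is its own inverse, any single missing block in a stripe can be recovered from the remaining blocks and the parity block.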

II. DRALIC Architecture

The main idea of DRALIC is very simple. It
combines or bridges the disk I/O host bus adapter card
(HBA) and the network interface card (NIC) to
implement distributed RAID and global caching.
Figure 1 shows the conceptual diagram of
DRALIC. A disk that exists in a PC/Workstation
(node) is partitioned into two parts: a local disk
that holds the OS and local data and applications, and
a DRALIC disk that is used by DRALIC.
The DRALIC disks in all nodes in the system are
interconnected through the DRALIC controller and a
network switch to form a distributed RAID. The
system RAM in each node is also partitioned into two
parts: one is controlled by the local OS and the other,
referred to as DRALIC RAM, is controlled by the
DRALIC driver. The collection of DRALIC RAM in
all nodes forms a unified system cache for the
underlying RAID system.
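
One way to picture how the partitions cooperate is a three-level read path: local DRALIC RAM, then DRALIC RAM in other nodes, then the distributed RAID. The sketch below is only an illustration under stated assumptions. The lookup order, the round-robin block placement, and the names `Node` and `read_block` are hypothetical and are not specified in the text.

```python
class Node:
    """Hypothetical node with a DRALIC RAM partition and a DRALIC disk."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.dralic_ram = {}   # block number -> data (cache partition)
        self.dralic_disk = {}  # block number -> data (this node's RAID share)


def read_block(nodes, local_id, block_no):
    """Return (data, source) for a block requested at node `local_id`."""
    local = nodes[local_id]
    # 1. Local DRALIC RAM.
    if block_no in local.dralic_ram:
        return local.dralic_ram[block_no], "local RAM"
    # 2. DRALIC RAM of any other node (the global cache).
    for node in nodes:
        if node is not local and block_no in node.dralic_ram:
            return node.dralic_ram[block_no], "remote RAM"
    # 3. Distributed RAID; round-robin striping is an assumption.
    owner = nodes[block_no % len(nodes)]
    data = owner.dralic_disk[block_no]
    local.dralic_ram[block_no] = data  # keep a copy for later requests
    return data, "RAID"


nodes = [Node(i) for i in range(4)]
nodes[2].dralic_disk[6] = b"block-6"  # block 6 maps to node 6 % 4 == 2
nodes[3].dralic_ram[9] = b"block-9"   # block 9 is cached on node 3

first = read_block(nodes, 0, 6)   # served by the distributed RAID
second = read_block(nodes, 0, 6)  # now cached in node 0's DRALIC RAM
third = read_block(nodes, 0, 9)   # found in node 3's DRALIC RAM
```

A real controller or driver would also need a directory or lookup protocol for the remote caches, a replacement policy, and parity handling, none of which is modeled here.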




III. Preliminary Performance Analysis

To demonstrate the feasibility and performance
potential of the proposed DRALIC, we present a
preliminary performance analysis to look at the
effects of bus and network delays on the performance
potential of the DRALIC architecture. While our
research will focus on system I/O, the current PCI
bus can run at 33-132 MHz with a data width of 32 or
64 bits. The memory bandwidth of a PCI-based system
is BW_pci = 33M * 32 bits/sec = 132 MB/sec. A Gigabit
Ethernet switch with a transfer speed of up to 1 Gbps
can provide a network bandwidth of approximately
BW_net = 100 MB/s. The overhead of a network operation,
including both software and hardware, is assumed to
be OH_net = [...] ms. As for disks, we consider a typical [...]
Based on the above disk parameters, we can assume
the typical bandwidth of a disk to be BW_disk = 25 MB/s
and the overhead of a disk to be OH_disk = 12 ms. The
following is a list of notations and formulae used in
our analysis:

B: Data block size (8KB);

N: Number of nodes within the DRALIC system;

H_lm: Local memory hit ratio;

H_rm: Remote memory hit ratio;

T_lm: Local memory access time (ms);

T_rm: Remote memory access time (ms);

T_raid: Access time from the distributed RAID (ms);

T_pc: Average I/O response time of traditional PCs
with no cooperative caching (ms);

T_dralic: Average I/O response time of DRALIC
system (ms).
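
The notation above can be tied together with a simple weighted-average model. The sketch below is an assumption made for illustration, not a reconstruction of the formulae used in the analysis: it treats H_rm as the fraction of local misses that are served from remote memory, sets T_raid equal to the access time of a single disk, and uses a hypothetical network overhead of 0.1 ms.

```python
B_KB = 8.0       # data block size B (from the text)
BW_PCI = 132.0   # MB/s, PCI bus (from the text)
BW_NET = 100.0   # MB/s, Gigabit Ethernet (from the text)
BW_DISK = 25.0   # MB/s, typical disk (from the text)
OH_DISK = 12.0   # ms, disk overhead (from the text)
OH_NET = 0.1     # ms, ASSUMED placeholder for the network overhead


def transfer_ms(block_kb, bandwidth_mb_s):
    """Time in ms to move one block at the given bandwidth."""
    return block_kb / 1024.0 / bandwidth_mb_s * 1000.0


T_LM = transfer_ms(B_KB, BW_PCI)              # local memory access
T_RM = OH_NET + transfer_ms(B_KB, BW_NET)     # remote memory access
T_DISK = OH_DISK + transfer_ms(B_KB, BW_DISK)  # single disk access
T_RAID = T_DISK                               # ASSUMED: no RAID speedup modeled


def t_pc(h_lm):
    """Average response time without cooperative caching (assumed form)."""
    return h_lm * T_LM + (1.0 - h_lm) * T_DISK


def t_dralic(h_lm, h_rm):
    """Average response time with remote caching (assumed form)."""
    miss_path = h_rm * T_RM + (1.0 - h_rm) * T_RAID
    return h_lm * T_LM + (1.0 - h_lm) * miss_path


print(f"T_lm={T_LM:.3f} ms, T_rm={T_RM:.3f} ms, T_disk={T_DISK:.3f} ms")
```

Under these assumptions a disk access costs over 12 ms while a remote memory access stays well under 1 ms, which is why the remote hit ratio dominates the result.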

Figure 2: Remote cache miss ratio



Lacking measured hit ratios for remote caches, we
assume the remote hit ratio to be a logarithmic function of
the number of nodes in the system, as shown in Figure 2.
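
A logarithmic hit-ratio assumption of this kind might look like the sketch below. The exact curve of Figure 2 is not given in the text, so the base-2 logarithm, the `slope` of 0.2, and the `cap` of 0.95 are purely illustrative choices, as is the function name `remote_hit_ratio`.

```python
import math


def remote_hit_ratio(n_nodes, slope=0.2, cap=0.95):
    """Assumed remote hit ratio that grows with log2 of the node count.

    `slope` and `cap` are illustrative values, not taken from the paper.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return min(cap, slope * math.log2(n_nodes))


for n in (1, 2, 4, 8, 16, 32):
    print(n, round(remote_hit_ratio(n), 2))
```

With a single node there is no remote cache, so the ratio is zero; each doubling of the node count then adds a fixed increment until the cap is reached.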



It is reasonable to assume that the remote cache hit
ratio increases with the number of nodes because
more nodes give a larger cooperative cache space. The
exact hit ratio is not significant here since we
use the hit ratio as a varying parameter to observe
I/O performance as a function of it. From Figure 3,
we can see that even with a hit ratio of 50%,
performance is doubled. With a remote hit ratio of
80%, a factor of 4 performance improvement can be
obtained. The data in this figure are sufficient to
show the potential benefits of DRALIC.
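
The quoted improvements can be sanity-checked with a back-of-the-envelope calculation. The sketch below looks only at requests that miss the local memory and compares a plain disk access with a mix of remote memory hits and disk accesses. The 0.1 ms network overhead and the assumption that a RAID access costs the same as a single disk access are hypothetical, so the numbers indicate the trend rather than reproduce Figure 3.

```python
BLOCK_MB = 8.0 / 1024.0                   # 8KB block (from the text)
T_DISK = 12.0 + BLOCK_MB / 25.0 * 1000.0  # ms: 12 ms overhead + 25 MB/s transfer
OH_NET = 0.1                              # ms, ASSUMED network overhead
T_RM = OH_NET + BLOCK_MB / 100.0 * 1000.0  # ms: remote memory over 100 MB/s link


def speedup(remote_hit_ratio):
    """Disk-only time divided by the time with a remote cache (assumed model)."""
    with_cache = remote_hit_ratio * T_RM + (1.0 - remote_hit_ratio) * T_DISK
    return T_DISK / with_cache


for h in (0.0, 0.5, 0.8):
    print(f"remote hit ratio {h:.0%}: speedup {speedup(h):.2f}x")
```

Under these assumptions the speedup is close to 2 at a 50% remote hit ratio and between 4 and 5 at 80%, in line with the improvements described above. Since a remote hit is almost free compared with a disk access, the speedup is approximately 1 / (1 - h).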



Figure 3: Average I/O response time vs. number of nodes (plot title: "DRALIC: Nodes Influence")